Learning representations for Information Retrieval
Information retrieval is generally concerned with answering questions such as: is this document relevant to this query?
How similar are two queries or two documents?
How can query and document similarity be used to enhance relevance estimation?
In order to answer these questions, it is necessary to access computational representations of documents and queries.
For example, similarities between documents and queries may correspond to a distance or a divergence defined on the representation space.
It is generally assumed that the quality of the representation has a direct impact on the bias with respect to the true similarity, estimated by means of human intervention.
Building useful representations for documents and queries has always been central to information retrieval research.
The goal of this thesis is to provide new ways of estimating such representations and the relevance relationship between them.
We present four articles that have been published in international conferences and one published in an information retrieval evaluation
forum. The first two articles can be categorized as feature engineering approaches, which transduce a priori knowledge about the domain into the features of the representation.
We present a novel retrieval model that compares favorably to existing models in terms of both theoretical originality and experimental effectiveness.
The remaining two articles mark a significant change in our vision and originate from the widespread interest in deep learning research that took place during the time they were written.
Therefore, they naturally belong to the category of representation learning approaches, also known as feature learning. Unlike previous approaches, the learning model discovers on its own the most important features for the task at hand, given a considerable amount of labeled data. We propose to model the semantic relationships between documents and queries, and between queries themselves.
The models presented have also shown improved effectiveness on standard test collections. These last articles are amongst the first applications of representation learning with neural networks to information retrieval. This line of research leads to the following observation: future improvements in information retrieval effectiveness will have to rely on representation learning techniques instead of manually defined representation spaces.
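The thesis abstract above frames relevance estimation as a similarity (a distance or divergence) computed in a learned representation space. As a minimal illustration, not taken from the thesis itself, a cosine similarity over dense query and document vectors can rank documents; all vectors below are invented for the example:

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two dense vector representations."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Invented representations for one query and two documents.
query = [0.9, 0.1, 0.0]
doc_relevant = [0.8, 0.2, 0.1]
doc_offtopic = [0.0, 0.1, 0.9]

# Relevance estimation as similarity in the representation space.
scores = {
    "doc_relevant": cosine_similarity(query, doc_relevant),
    "doc_offtopic": cosine_similarity(query, doc_offtopic),
}
```

Whether the vectors come from feature engineering or from a learned model, the ranking step reduces to this kind of comparison in the shared space.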
Building End-To-End Dialogue Systems Using Generative Hierarchical Neural Network Models
We investigate the task of building open domain, conversational dialogue
systems based on large dialogue corpora using generative models. Generative
models produce system responses that are autonomously generated word-by-word,
opening up the possibility for realistic, flexible interactions. In support of
this goal, we extend the recently proposed hierarchical recurrent
encoder-decoder neural network to the dialogue domain, and demonstrate that
this model is competitive with state-of-the-art neural language models and
back-off n-gram models. We investigate the limitations of this and similar
approaches, and show how the model's performance can be improved by
bootstrapping the learning from a larger question-answer pair corpus and from
pretrained word embeddings.
Comment: 8 pages with references; Published in AAAI 2016 (Special Track on Cognitive Systems)
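The hierarchical recurrent encoder-decoder mentioned above encodes a dialogue at two levels: a word-level pass over each utterance, then an utterance-level pass over the resulting summaries. The sketch below shows only that two-level structure; the toy moving-average update stands in for the GRU cells of the actual model, and all vectors are invented:

```python
def rnn_step(state, x, decay=0.5):
    # Toy recurrent update (exponential moving average); stands in
    # for the GRU cell used in the actual model.
    return [decay * s + (1 - decay) * xi for s, xi in zip(state, x)]

def hierarchical_encode(dialogue):
    # dialogue: list of utterances; each utterance is a list of word vectors.
    dim = len(dialogue[0][0])
    context = [0.0] * dim            # dialogue-level state
    for utterance in dialogue:
        utt_state = [0.0] * dim      # utterance-level state
        for word_vec in utterance:   # first level: words -> utterance summary
            utt_state = rnn_step(utt_state, word_vec)
        # second level: utterance summaries -> dialogue context
        context = rnn_step(context, utt_state)
    return context

# A toy two-utterance dialogue with invented 3-d word vectors.
dialogue = [
    [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    [[0.0, 0.0, 1.0]],
]
context_vec = hierarchical_encode(dialogue)
```

In the full model the dialogue-level state conditions a decoder RNN that generates the next response word-by-word.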
Twin Networks: Matching the Future for Sequence Generation
We propose a simple technique for encouraging generative RNNs to plan ahead.
We train a "backward" recurrent network to generate a given sequence in reverse
order, and we encourage states of the forward model to predict cotemporal
states of the backward model. The backward network is used only during
training, and plays no role during sampling or inference. We hypothesize that
our approach eases modeling of long-term dependencies by implicitly forcing the
forward states to hold information about the longer-term future (as contained
in the backward states). We show empirically that our approach achieves a 9%
relative improvement on a speech recognition task and a significant
improvement on a COCO caption generation task.
Comment: 12 pages, 3 figures, published at ICLR 201
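The core of the twin-network idea is a training-time penalty pulling each forward state toward the cotemporal backward state. A minimal sketch of such a penalty, assuming states are given as plain lists of floats (the paper's exact parameterization and weighting may differ):

```python
def twin_penalty(forward_states, backward_states):
    """Mean squared distance between cotemporal forward and backward
    RNN states. Added to the training loss only; the backward network
    plays no role during sampling or inference."""
    assert len(forward_states) == len(backward_states)
    total = 0.0
    for f, b in zip(forward_states, backward_states):
        total += sum((fi - bi) ** 2 for fi, bi in zip(f, b))
    return total / len(forward_states)

# Invented 2-d states for a length-2 sequence.
fwd = [[1.0, 0.0], [0.5, 0.5]]
bwd = [[1.0, 0.0], [0.0, 1.0]]
penalty = twin_penalty(fwd, bwd)
```

Minimizing this term forces the forward state at each step to carry information that the backward model, which has seen the future of the sequence, already holds.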
A Hierarchical Latent Variable Encoder-Decoder Model for Generating Dialogues
Sequential data often possesses a hierarchical structure with complex
dependencies between subsequences, such as found between the utterances in a
dialogue. In an effort to model this kind of generative process, we propose a
neural network-based generative architecture, with latent stochastic variables
that span a variable number of time steps. We apply the proposed model to the
task of dialogue response generation and compare it with recent neural network
architectures. We evaluate the model performance through automatic evaluation
metrics and by carrying out a human evaluation. The experiments demonstrate
that our model improves upon recently proposed models and that the latent
variables facilitate the generation of long outputs and maintain the context.Comment: 15 pages, 5 tables, 4 figure